Overview

Dataset statistics

Number of variables: 8
Number of observations: 10001358
Missing cells: 52158
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 GiB
Average record size in memory: 119.2 B
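
The derived figures in this table follow from the raw counts. The short Python sketch below recomputes them; the variable names are illustrative and not part of the report, and the memory total is inferred from the reported average record size rather than measured.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 8
n_observations = 10_001_358
missing_cells = 52_158
avg_record_bytes = 119.2  # reported average record size in memory

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size, in GiB (2**30 bytes).
total_gib = avg_record_bytes * n_observations / 2**30

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_pct:.2f}%")
print(f"total memory:  {total_gib:.1f} GiB")
```

Rounded to one decimal place, both results agree with the reported 0.1% missing cells and 1.1 GiB total size.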

Variable types

Numeric: 7
Categorical: 1

Alerts

High correlation: CNT_INSTALMENT is highly overall correlated with CNT_INSTALMENT_FUTURE
High correlation: CNT_INSTALMENT_FUTURE is highly overall correlated with CNT_INSTALMENT
High correlation: SK_DPD is highly overall correlated with SK_DPD_DEF
High correlation: SK_DPD_DEF is highly overall correlated with SK_DPD
Imbalance: NAME_CONTRACT_STATUS is highly imbalanced (85.0%)
Skewed: SK_DPD_DEF is highly skewed (γ1 = 66.33990581)
Zeros: CNT_INSTALMENT_FUTURE has 1185960 (11.9%) zeros
Zeros: SK_DPD has 9706131 (97.0%) zeros
Zeros: SK_DPD_DEF has 9887389 (98.9%) zeros
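
The 85.0% imbalance figure for NAME_CONTRACT_STATUS is consistent with one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition, which the report itself does not state. It uses the nine category counts listed under Common Values for that variable, and `imbalance_score` is an illustrative helper name.

```python
import math

# Category counts for NAME_CONTRACT_STATUS, copied from its Common Values table.
status_counts = {
    "Active": 9151119,
    "Completed": 744883,
    "Signed": 87260,
    "Demand": 7065,
    "Returned to the store": 5461,
    "Approved": 4917,
    "Amortized debt": 636,
    "Canceled": 15,
    "XNA": 2,
}


def imbalance_score(class_counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 for one dominant class.

    H is the Shannon entropy (in bits) of the class shares and k the number
    of classes. This definition is an assumption about how the score is derived.
    """
    total = sum(class_counts)
    k = len(class_counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score(list(status_counts.values()))
print(f"imbalance: {score:.1%}")
```

A score this high means a single class ("Active", 91.5% of rows) carries almost all of the mass.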

Reproduction

Analysis started: 2025-02-02 10:18:20.110300
Analysis finished: 2025-02-02 10:26:06.894337
Duration: 7 minutes and 46.78 seconds
Software version: ydata-profiling v4.12.2
Download configuration: config.json
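
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2025-02-02 10:18:20.110300")
finished = datetime.fromisoformat("2025-02-02 10:26:06.894337")

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```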

Variables

SK_ID_PREV
Real number (ℝ)

Distinct: 936325
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1903216.6
Minimum: 1000001
Maximum: 2843499
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 76.3 MiB

Quantile statistics

Minimum: 1000001
5-th percentile: 1083660
Q1: 1434405
Median: 1896565
Q3: 2368963
95-th percentile: 2749774
Maximum: 2843499
Range: 1843498
Interquartile range (IQR): 934558

Descriptive statistics

Standard deviation: 535846.53
Coefficient of variation (CV): 0.28154784
Kurtosis: -1.216184
Mean: 1903216.6
Median Absolute Deviation (MAD): 466541
Skewness: 0.044229231
Sum: 1.9034751 × 10^13
Variance: 2.871315 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                    Count       Frequency (%)
1856103                  96          < 0.1%
2706683                  96          < 0.1%
1617536                  96          < 0.1%
1364606                  96          < 0.1%
1057553                  96          < 0.1%
2764503                  96          < 0.1%
1271619                  96          < 0.1%
1326802                  96          < 0.1%
2181136                  96          < 0.1%
1276472                  96          < 0.1%
Other values (936315)    10000398    > 99.9%

Minimum 10 values

Value      Count   Frequency (%)
1000001    3       < 0.1%
1000002    5       < 0.1%
1000003    4       < 0.1%
1000004    8       < 0.1%
1000005    11      < 0.1%
1000007    5       < 0.1%
1000008    10      < 0.1%
1000009    7       < 0.1%
1000010    11      < 0.1%
1000011    13      < 0.1%

Maximum 10 values

Value      Count   Frequency (%)
2843499    11      < 0.1%
2843498    7       < 0.1%
2843497    21      < 0.1%
2843495    8       < 0.1%
2843494    3       < 0.1%
2843492    13      < 0.1%
2843491    10      < 0.1%
2843490    7       < 0.1%
2843489    25      < 0.1%
2843488    11      < 0.1%

SK_ID_CURR
Real number (ℝ)

Distinct: 337252
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 278403.86
Minimum: 100001
Maximum: 456255
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 76.3 MiB

Quantile statistics

Minimum: 100001
5-th percentile: 117988
Q1: 189550
Median: 278654
Q3: 367429
95-th percentile: 438533
Maximum: 456255
Range: 356254
Interquartile range (IQR): 177879

Descriptive statistics

Standard deviation: 102763.75
Coefficient of variation (CV): 0.36911753
Kurtosis: -1.1968387
Mean: 278403.86
Median Absolute Deviation (MAD): 88944
Skewness: -0.0031282533
Sum: 2.7844167 × 10^12
Variance: 1.0560387 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                    Count      Frequency (%)
265042                   295        < 0.1%
172612                   247        < 0.1%
309133                   246        < 0.1%
127659                   245        < 0.1%
185185                   245        < 0.1%
197583                   245        < 0.1%
203046                   244        < 0.1%
362661                   239        < 0.1%
398407                   237        < 0.1%
228307                   235        < 0.1%
Other values (337242)    9998880    > 99.9%

Minimum 10 values

Value     Count   Frequency (%)
100001    9       < 0.1%
100002    19      < 0.1%
100003    28      < 0.1%
100004    4       < 0.1%
100005    11      < 0.1%
100006    21      < 0.1%
100007    66      < 0.1%
100008    83      < 0.1%
100009    64      < 0.1%
100010    11      < 0.1%

Maximum 10 values

Value     Count   Frequency (%)
456255    71      < 0.1%
456254    20      < 0.1%
456253    17      < 0.1%
456252    7       < 0.1%
456251    9       < 0.1%
456250    30      < 0.1%
456249    13      < 0.1%
456248    43      < 0.1%
456247    27      < 0.1%
456246    7       < 0.1%

MONTHS_BALANCE
Real number (ℝ)

Distinct: 96
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -35.012588
Minimum: -96
Maximum: -1
Zeros: 0
Zeros (%): 0.0%
Negative: 10001358
Negative (%): 100.0%
Memory size: 76.3 MiB

Quantile statistics

Minimum: -96
5-th percentile: -85
Q1: -54
Median: -28
Q3: -13
95-th percentile: -4
Maximum: -1
Range: 95
Interquartile range (IQR): 41

Descriptive statistics

Standard deviation: 26.06657
Coefficient of variation (CV): -0.74449138
Kurtosis: -0.71068083
Mean: -35.012588
Median Absolute Deviation (MAD): 18
Skewness: -0.67277715
Sum: -3.5017343 × 10^8
Variance: 679.46607
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count      Frequency (%)
-10                  216441     2.2%
-11                  216023     2.2%
-9                   215558     2.2%
-12                  214716     2.1%
-8                   214149     2.1%
-13                  210950     2.1%
-7                   210229     2.1%
-14                  208352     2.1%
-6                   206849     2.1%
-15                  204935     2.0%
Other values (86)    7883156    78.8%

Minimum 10 values

Value   Count    Frequency (%)
-96     36448    0.4%
-95     38514    0.4%
-94     39900    0.4%
-93     41025    0.4%
-92     42283    0.4%
-91     43652    0.4%
-90     45295    0.5%
-89     47763    0.5%
-88     49950    0.5%
-87     51805    0.5%

Maximum 10 values

Value   Count     Frequency (%)
-1      94908     0.9%
-2      169529    1.7%
-3      183589    1.8%
-4      193147    1.9%
-5      200726    2.0%
-6      206849    2.1%
-7      210229    2.1%
-8      214149    2.1%
-9      215558    2.2%
-10     216441    2.2%

CNT_INSTALMENT
Real number (ℝ)

Alerts: High correlation

Distinct: 73
Distinct (%): < 0.1%
Missing: 26071
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.08965
Minimum: 1
Maximum: 92
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 76.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 10
Median: 12
Q3: 24
95-th percentile: 45
Maximum: 92
Range: 91
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 11.995056
Coefficient of variation (CV): 0.70189007
Kurtosis: 2.4468723
Mean: 17.08965
Median Absolute Deviation (MAD): 6
Skewness: 1.6017338
Sum: 1.7047417 × 10^8
Variance: 143.88137
Monotonicity: Not monotonic
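
Several of the statistics listed for CNT_INSTALMENT are simple functions of others. The sketch below recomputes the coefficient of variation, variance, IQR and range from the reported mean, standard deviation, quartiles and extremes. The variable names are illustrative, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Reported summary values for CNT_INSTALMENT.
mean = 17.08965
std = 11.995056
q1, q3 = 10, 24
minimum, maximum = 1, 92

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.6f} variance={variance:.4f} IQR={iqr} range={value_range}")
```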
Histogram with fixed size bins (bins=50)

Common values

Value                Count      Frequency (%)
12                   2496845    25.0%
24                   1517472    15.2%
10                   1243449    12.4%
6                    1065500    10.7%
18                   727394     7.3%
36                   584574     5.8%
8                    303751     3.0%
48                   278513     2.8%
4                    238223     2.4%
30                   211920     2.1%
Other values (63)    1307646    13.1%

Minimum 10 values

Value   Count      Frequency (%)
1       24544      0.2%
2       26826      0.3%
3       43081      0.4%
4       238223     2.4%
5       136840     1.4%
6       1065500    10.7%
7       106714     1.1%
8       303751     3.0%
9       148355     1.5%
10      1243449    12.4%

Maximum 10 values

Value   Count   Frequency (%)
92      1       < 0.1%
84      5       < 0.1%
81      1       < 0.1%
77      4       < 0.1%
72      1519    < 0.1%
71      3       < 0.1%
70      2       < 0.1%
68      1       < 0.1%
66      108     < 0.1%
64      3       < 0.1%

CNT_INSTALMENT_FUTURE
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 79
Distinct (%): < 0.1%
Missing: 26087
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.48384
Minimum: 0
Maximum: 85
Zeros: 1185960
Zeros (%): 11.9%
Negative: 0
Negative (%): 0.0%
Memory size: 76.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
Median: 7
Q3: 14
95-th percentile: 35
Maximum: 85
Range: 85
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 11.109058
Coefficient of variation (CV): 1.0596363
Kurtosis: 3.7133282
Mean: 10.48384
Median Absolute Deviation (MAD): 5
Skewness: 1.8467459
Sum: 1.0457915 × 10^8
Variance: 123.41116
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count      Frequency (%)
0                    1185960    11.9%
6                    614058     6.1%
4                    613632     6.1%
5                    600295     6.0%
3                    582007     5.8%
2                    547199     5.5%
1                    512279     5.1%
10                   481390     4.8%
8                    480167     4.8%
7                    472665     4.7%
Other values (69)    3885619    38.9%

Minimum 10 values

Value   Count      Frequency (%)
0       1185960    11.9%
1       512279     5.1%
2       547199     5.5%
3       582007     5.8%
4       613632     6.1%
5       600295     6.0%
6       614058     6.1%
7       472665     4.7%
8       480167     4.8%
9       467606     4.7%

Maximum 10 values

Value   Count   Frequency (%)
85      1       < 0.1%
84      1       < 0.1%
83      1       < 0.1%
82      1       < 0.1%
81      1       < 0.1%
80      1       < 0.1%
72      35      < 0.1%
71      33      < 0.1%
70      32      < 0.1%
69      32      < 0.1%

NAME_CONTRACT_STATUS
Categorical

Alerts: Imbalance

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 603.1 MiB

Value                    Count
Active                   9151119
Completed                744883
Signed                   87260
Demand                   7065
Returned to the store    5461
Other values (4)         5570

Length

Max length: 21
Median length: 6
Mean length: 6.2331193
Min length: 3
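
The length statistics can be reproduced from the category counts alone, since each category string has a fixed length. The sketch below weights each string length by its count, using the counts from this variable's Common Values table; the variable names are illustrative.

```python
# Category counts for NAME_CONTRACT_STATUS, copied from its Common Values table.
category_counts = {
    "Active": 9151119,
    "Completed": 744883,
    "Signed": 87260,
    "Demand": 7065,
    "Returned to the store": 5461,
    "Approved": 4917,
    "Amortized debt": 636,
    "Canceled": 15,
    "XNA": 2,
}

total_rows = sum(category_counts.values())

# Total characters: each string's length weighted by how often it occurs.
total_chars = sum(len(value) * n for value, n in category_counts.items())

mean_length = total_chars / total_rows
max_length = max(len(value) for value in category_counts)
min_length = min(len(value) for value in category_counts)

print(total_chars, round(mean_length, 7), max_length, min_length)
```

The total agrees with the 62339658 characters reported under Characters and Unicode.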

Characters and Unicode

Total characters: 62339658
Distinct characters: 27
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Active
2nd row: Active
3rd row: Active
4th row: Active
5th row: Active

Common Values

Value                    Count      Frequency (%)
Active                   9151119    91.5%
Completed                744883     7.4%
Signed                   87260      0.9%
Demand                   7065       0.1%
Returned to the store    5461       0.1%
Approved                 4917       < 0.1%
Amortized debt           636        < 0.1%
Canceled                 15         < 0.1%
XNA                      2          < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)


Words

Value               Count      Frequency (%)
active              9151119    91.3%
completed           744883     7.4%
signed              87260      0.9%
demand              7065       0.1%
returned            5461       0.1%
to                  5461       0.1%
the                 5461       0.1%
store               5461       0.1%
approved            4917       < 0.1%
amortized           636        < 0.1%
Other values (3)    653        < 0.1%
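
The word counts above differ slightly from the category counts because multi-word categories contribute several tokens each. This is why "active" is 91.3% of words but "Active" is 91.5% of rows. The sketch below assumes lowercasing and whitespace splitting. That tokenization is an assumption rather than something the report states, but it reproduces the table.

```python
from collections import Counter

# Category counts for NAME_CONTRACT_STATUS, copied from its Common Values table.
category_counts = {
    "Active": 9151119,
    "Completed": 744883,
    "Signed": 87260,
    "Demand": 7065,
    "Returned to the store": 5461,
    "Approved": 4917,
    "Amortized debt": 636,
    "Canceled": 15,
    "XNA": 2,
}

# Each word of a category occurs once per row that carries the category.
word_counts = Counter()
for value, n in category_counts.items():
    for word in value.lower().split():
        word_counts[word] += n

total_words = sum(word_counts.values())
active_share = 100 * word_counts["active"] / total_words

# Everything outside the ten most frequent words goes into "Other values".
top_ten = word_counts.most_common(10)
other_count = total_words - sum(n for _, n in top_ten)

print(total_words, round(active_share, 1), other_count)
```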

Most occurring characters

Value                Count       Frequency (%)
e                    10763273    17.3%
t                    9919118     15.9%
i                    9239015     14.8%
A                    9156674     14.7%
v                    9156036     14.7%
c                    9151134     14.7%
d                    850873      1.4%
o                    761358      1.2%
p                    754717      1.2%
m                    752584      1.2%
Other values (17)    1834876     2.9%

Most occurring categories

Value        Count       Frequency (%)
(unknown)    62339658    100.0%

Most occurring scripts

Value        Count       Frequency (%)
(unknown)    62339658    100.0%

Most occurring blocks

Value        Count       Frequency (%)
(unknown)    62339658    100.0%

SK_DPD
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 3400
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.606928
Minimum: 0
Maximum: 4231
Zeros: 9706131
Zeros (%): 97.0%
Negative: 0
Negative (%): 0.0%
Memory size: 76.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 4231
Range: 4231
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 132.71404
Coefficient of variation (CV): 11.434037
Kurtosis: 255.3222
Mean: 11.606928
Median Absolute Deviation (MAD): 0
Skewness: 14.899126
Sum: 1.1608504 × 10^8
Variance: 17613.017
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                  Count      Frequency (%)
0                      9706131    97.0%
1                      21872      0.2%
2                      17358      0.2%
3                      14403      0.1%
4                      12350      0.1%
5                      11046      0.1%
6                      9615       0.1%
7                      8332       0.1%
8                      7360       0.1%
9                      6668       0.1%
Other values (3390)    186223     1.9%

Minimum 10 values

Value   Count      Frequency (%)
0       9706131    97.0%
1       21872      0.2%
2       17358      0.2%
3       14403      0.1%
4       12350      0.1%
5       11046      0.1%
6       9615       0.1%
7       8332       0.1%
8       7360       0.1%
9       6668       0.1%

Maximum 10 values

Value   Count   Frequency (%)
4231    1       < 0.1%
4200    1       < 0.1%
4172    1       < 0.1%
4141    1       < 0.1%
4110    1       < 0.1%
4108    1       < 0.1%
4080    1       < 0.1%
4078    1       < 0.1%
4050    2       < 0.1%
4049    1       < 0.1%

SK_DPD_DEF
Real number (ℝ)

Alerts: High correlation, Skewed, Zeros

Distinct: 2307
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.65446842
Minimum: 0
Maximum: 3595
Zeros: 9887389
Zeros (%): 98.9%
Negative: 0
Negative (%): 0.0%
Memory size: 76.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 3595
Range: 3595
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 32.762491
Coefficient of variation (CV): 50.059696
Kurtosis: 4836.5494
Mean: 0.65446842
Median Absolute Deviation (MAD): 0
Skewness: 66.339906
Sum: 6545573
Variance: 1073.3808
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                  Count      Frequency (%)
0                      9887389    98.9%
1                      22134      0.2%
2                      14690      0.1%
3                      11652      0.1%
4                      9528       0.1%
5                      8031       0.1%
6                      6629       0.1%
7                      5425       0.1%
8                      4538       < 0.1%
9                      3935       < 0.1%
Other values (2297)    27407      0.3%

Minimum 10 values

Value   Count      Frequency (%)
0       9887389    98.9%
1       22134      0.2%
2       14690      0.1%
3       11652      0.1%
4       9528       0.1%
5       8031       0.1%
6       6629       0.1%
7       5425       0.1%
8       4538       < 0.1%
9       3935       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
3595    1       < 0.1%
3565    1       < 0.1%
3534    1       < 0.1%
3506    1       < 0.1%
3475    1       < 0.1%
3468    1       < 0.1%
3444    1       < 0.1%
3438    1       < 0.1%
3414    1       < 0.1%
3407    1       < 0.1%

Interactions

49 pairwise interaction plots (images not preserved).

Correlations

Columns: (1) CNT_INSTALMENT, (2) CNT_INSTALMENT_FUTURE, (3) MONTHS_BALANCE, (4) NAME_CONTRACT_STATUS, (5) SK_DPD, (6) SK_DPD_DEF, (7) SK_ID_CURR, (8) SK_ID_PREV

                         (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)
CNT_INSTALMENT           1.000   0.741   0.348   0.081  -0.075   0.006   0.000   0.003
CNT_INSTALMENT_FUTURE    0.741   1.000   0.249   0.098  -0.150  -0.032  -0.001   0.003
MONTHS_BALANCE           0.348   0.249   1.000   0.022  -0.105  -0.038   0.000   0.002
NAME_CONTRACT_STATUS     0.081   0.098   0.022   1.000   0.075   0.228   0.003   0.004
SK_DPD                  -0.075  -0.150  -0.105   0.075   1.000   0.606   0.002   0.000
SK_DPD_DEF               0.006  -0.032  -0.038   0.228   0.606   1.000   0.000   0.001
SK_ID_CURR               0.000  -0.001   0.000   0.003   0.002   0.000   1.000  -0.000
SK_ID_PREV               0.003   0.003   0.002   0.004   0.000   0.001  -0.000   1.000
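
The High correlation alerts can be read straight off this matrix. The sketch below lists every variable pair whose absolute coefficient reaches a cutoff of 0.5. The cutoff is an assumption for illustration, not a value stated in the report, and the variable names are likewise illustrative.

```python
from itertools import combinations

names = [
    "CNT_INSTALMENT", "CNT_INSTALMENT_FUTURE", "MONTHS_BALANCE",
    "NAME_CONTRACT_STATUS", "SK_DPD", "SK_DPD_DEF", "SK_ID_CURR", "SK_ID_PREV",
]

# Correlation matrix copied from the table above (rows and columns in `names` order).
matrix = [
    [1.000, 0.741, 0.348, 0.081, -0.075, 0.006, 0.000, 0.003],
    [0.741, 1.000, 0.249, 0.098, -0.150, -0.032, -0.001, 0.003],
    [0.348, 0.249, 1.000, 0.022, -0.105, -0.038, 0.000, 0.002],
    [0.081, 0.098, 0.022, 1.000, 0.075, 0.228, 0.003, 0.004],
    [-0.075, -0.150, -0.105, 0.075, 1.000, 0.606, 0.002, 0.000],
    [0.006, -0.032, -0.038, 0.228, 0.606, 1.000, 0.000, 0.001],
    [0.000, -0.001, 0.000, 0.003, 0.002, 0.000, 1.000, -0.000],
    [0.003, 0.003, 0.002, 0.004, 0.000, 0.001, -0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff for flagging a pair as highly correlated

# Visit each unordered pair once (upper triangle, diagonal excluded).
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i, j in combinations(range(len(names)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, coefficient in high_pairs:
    print(f"{left} ~ {right}: {coefficient:.3f}")
```

With this cutoff, the two flagged pairs are the same ones named in the report's alerts.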

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

Index   SK_ID_PREV   SK_ID_CURR   MONTHS_BALANCE   CNT_INSTALMENT   CNT_INSTALMENT_FUTURE   NAME_CONTRACT_STATUS   SK_DPD   SK_DPD_DEF
0       1803195      182943       -31              48.0             45.0                    Active                 0        0
1       1715348      367990       -33              36.0             35.0                    Active                 0        0
2       1784872      397406       -32              12.0             9.0                     Active                 0        0
3       1903291      269225       -35              48.0             42.0                    Active                 0        0
4       2341044      334279       -35              36.0             35.0                    Active                 0        0
5       2207092      342166       -32              12.0             12.0                    Active                 0        0
6       1110516      204376       -38              48.0             43.0                    Active                 0        0
7       1387235      153211       -35              36.0             36.0                    Active                 0        0
8       1220500      112740       -31              12.0             12.0                    Active                 0        0
9       2371489      274851       -32              24.0             16.0                    Active                 0        0

Last rows

Index      SK_ID_PREV   SK_ID_CURR   MONTHS_BALANCE   CNT_INSTALMENT   CNT_INSTALMENT_FUTURE   NAME_CONTRACT_STATUS   SK_DPD   SK_DPD_DEF
10001348   2234984      191559       -22              6.0              0.0                     Active                 797      0
10001349   2340692      104125       -24              6.0              0.0                     Active                 944      0
10001350   2593362      198894       -20              12.0             0.0                     Completed              0        0
10001351   2639809      288279       -25              6.0              0.0                     Active                 925      0
10001352   2700641      448867       -19              6.0              0.0                     Active                 843      0
10001353   2448283      226558       -20              6.0              0.0                     Active                 843      0
10001354   1717234      141565       -19              12.0             0.0                     Active                 602      0
10001355   1283126      315695       -21              10.0             0.0                     Active                 609      0
10001356   1082516      450255       -22              12.0             0.0                     Active                 614      0
10001357   1259607      174278       -52              16.0             0.0                     Completed              0        0